Applying Discrete PCA in Data Analysis

نویسندگان

  • Wray L. Buntine
  • Aleks Jakulin
چکیده

Methods for analysis of principal components in discrete data have existed for some time under various names such as grade of membership modelling, probabilistic latent semantic analysis, and genotype inference with admixture. In this paper we explore a number of extensions to the common theory, and present some application of these methods to some common statistical tasks. We show that these methods can be interpreted as a discrete version of ICA. We develop a hierarchical version yielding components at different levels of detail, and additional techniques for Gibbs sampling. We compare the algorithms on a text prediction task using support vector machines, and to information retrieval.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Topic-Specific Link Analysis using Independent Components for Information Retrieval

There has been mixed success in applying semantic component analysis (LSA, PLSA, discrete PCA, etc.) to information retrieval. Previous experiments have shown that high-fidelity language models do not imply good quality retrieval. Here we combine link analysis with discrete PCA (a semantic component method) to develop an auxiliary score for information retrieval that is used in post-filtering d...

متن کامل

Feature decorrelation methods in speech recognition. a comparative study

In this paper we study various decorrelation methods for the features used in speech recognition and we compare the performance of each one by running several tests with a speech database. First of all we study the Principal Components Analysis (PCA). PCA extracts the dimensions along which the data vary the most, and thus it allows us to reduce the dimension of the data point without signi can...

متن کامل

The Analysis of Bayesian Probit Regression of Binary and Polychotomous Response Data

The goal of this study is to introduce a statistical method regarding the analysis of specific latent data for regression analysis of the discrete data and to build a relation between a probit regression model (related to the discrete response) and normal linear regression model (related to the latent data of continuous response). This method provides precise inferences on binary and multinomia...

متن کامل

Improving the quality of images synthesized by discrete cosines transform – regression based method using principle component analysis

  Purpose: Different views of an individuals’ image may be required for proper face recognition.   Recently, discrete cosines transform (DCT) based method has been used to synthesize virtual   views of an image using only one frontal image. In this work the performance of two different   algorithms was examined to produce virtual views of one frontal image.   Materials and Methods: Two new meth...

متن کامل

Applying dimension reduction to EEG data by principal component analysis reduces the quality of its subsequent independent component decomposition.

Independent Component Analysis (ICA) has proven to be an effective data driven method for analyzing EEG data, separating signals from temporally and functionally independent brain and non-brain source processes and thereby increasing their definition. Dimension reduction by Principal Component Analysis (PCA) has often been recommended before ICA decomposition of EEG data, both to minimize the a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004